Abstract

This paper introduces the Egison programming language, whose
distinguishing feature is a strong pattern-matching facility that works
not only against algebraic data types but also against non-free data
types, whose data have multiple representations, such as sets and graphs.
Our language supports multiple occurrences of the same variable in a
pattern, multiple results of pattern-matching, polymorphism of
pattern-constructors, and loop-patterns, which are patterns that contain
an “and-so-forth” whose repeat count can be changed by a parameter. This
paper proposes a way to design expressions that have all of these
features and demonstrates how these features help express programs
concisely. Egison has already been implemented in Haskell.


1 Introduction 

Data types are called free if syntactically distinct terms are unequal. For
example, lists are a free data type when we construct them with the nil and
cons constructors. When the join constructor, which makes a list by appending
two lists, is introduced, lists are no longer free but non-free, since there
are multiple ways to split a list. On the other hand, multisets and sets are
always non-free data types. This is because there is no way to provide a set
of constructors that makes them free, since they ignore the order of the
elements.

Non-free data types often appear when expressing algorithms. Consequently, a
natural way to handle them is important. Without it, we need to translate
them into a free data type whose data have a standard form whenever we treat
them. For example, a set would be treated as a list. In many cases, this
translation leads to verbose nested loops and conditional branches. For
example, when we match identical pairs in a collection, we need to write
nested loops and conditional branches.

We have designed a new pattern-matching system to treat non-free data
types directly. We have implemented it in our new programming language
Egison [Egi(2011-2015)] using Haskell. In this paper, we demonstrate our
new pattern-matching system and show examples of programs that utilize
this system.






2 Demonstrations 


First, let us give an overview of our language. Our language is a
functional programming language with a lazy evaluation strategy and a
parenthesized syntax like that of Lisp.

⟨top-expr⟩ ::= ‘(define’ ⟨pat-var⟩ ⟨expr⟩ ‘)’                  (top level binding)
             | ⟨expr⟩

⟨pat-var⟩ ::= ‘$’ ⟨ident⟩                                      (pattern-variable)

⟨expr⟩ ::= ⟨constant⟩                                          (constant)
         | ⟨ident⟩                                             (variable)
         | ‘<’ ⟨Ident⟩ ⟨expr⟩* ‘>’                             (algebraic data)
         | ‘[’ ⟨expr⟩* ‘]’                                     (tuple)
         | ‘{’ ⟨expr⟩* ‘}’                                     (collection)
         | ‘(lambda [’ ⟨pat-var⟩* ‘]’ ⟨expr⟩ ‘)’               (function)
         | ‘(match-all’ ⟨expr⟩ ⟨expr⟩ ⟨match-clause⟩ ‘)’       (match-all expression)
         | ‘(match’ ⟨expr⟩ ⟨expr⟩ ‘{’ ⟨match-clause⟩* ‘})’     (match expression)
         | ⟨matcher-expr⟩                                      (matcher expression)

⟨match-clause⟩ ::= ‘[’ ⟨pattern⟩ ⟨expr⟩ ‘]’                    (match clause)

⟨pattern⟩ ::= ‘_’                                              (wildcard)
            | ⟨pat-var⟩                                        (pattern-variable)
            | ‘,’ ⟨expr⟩                                       (value-pattern)
            | ‘<’ ⟨ident⟩ ⟨pattern⟩* ‘>’                       (inductive-pattern)
            | ‘(loop’ ⟨pat-var⟩ ‘[’ ⟨expr⟩ ⟨expr⟩ ‘]’ ⟨pattern⟩ ⟨pattern⟩ ‘)’
                                                               (loop-pattern)

ident and Ident stand for identifiers that begin with a lowercase letter and
an uppercase letter, respectively. The match-all and match expressions are
the syntax for pattern-matching, the core of this paper. We explain them in
detail from the next section. Matcher expressions are used to define how to
pattern-match for each data type. In this paper, we focus on demonstrating
our pattern-matching expressions and do not go into matcher expressions or
the mechanism behind them.


2.1 Pattern-Matching with Backtracking 

The syntax of match-all expressions is given below. A match-all expression is
composed of a target, a matcher, and a match-clause, which consists of a
pattern and a body expression. A match-all expression evaluates the body of
the match-clause for each pattern-matching result and returns the collection
that contains all results. A matcher specifies the way to match the target
with the pattern.

⟨match-all-expr⟩ ::= ‘(match-all’ ⟨tgt-expr⟩ ⟨matcher-expr⟩ ⟨match-clause⟩ ‘)’
Here is the first demonstration of Egison. The only difference among the
following three expressions is the matcher. list, multiset, and set are
functions predefined in the Egison core library. They take a matcher and
return a matcher. For example, (list integer) is a matcher for a list of
integers, and (set (multiset integer)) is a matcher for a set of multisets
of integers. integer is a predefined matcher for integers in the Egison
core library.

> (match-all {1 2 3} (list integer) [<cons $x $ts> [x ts]])
{[1 {2 3}]}

> (match-all {1 2 3} (multiset integer) [<cons $x $ts> [x ts]])
{[1 {2 3}] [2 {1 3}] [3 {1 2}]}

> (match-all {1 2 3} (set integer) [<cons $x $ts> [x ts]])
{[1 {1 2 3}] [2 {1 2 3}] [3 {1 2 3}]}

In the above expressions, <cons $x $ts> is a pattern and cons is a
pattern-constructor. The name of a pattern-constructor starts with a
lowercase letter. cons divides a collection into a head element and the
rest. The meaning of the head differs for each matcher. ‘$x’ and ‘$ts’ are
called pattern-variables. We can access the results of pattern-matching by
referring to them.
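
To make the role of the matcher concrete, the following Python sketch models
how a cons pattern might decompose the same collection under each of the
three matchers. It is an illustration under stated assumptions, not Egison's
implementation: the function names cons_list, cons_multiset, and cons_set are
hypothetical, and collections are modeled as plain Python lists.

```python
def cons_list(xs):
    """List matcher: only the first element can be the head."""
    return [(xs[0], xs[1:])] if xs else []


def cons_multiset(xs):
    """Multiset matcher: any element can be the head; the rest drops that one occurrence."""
    return [(x, xs[:i] + xs[i + 1:]) for i, x in enumerate(xs)]


def cons_set(xs):
    """Set matcher: any element can be the head; the rest is still the whole
    set, because a set ignores duplicates."""
    return [(x, list(xs)) for x in xs]


print(cons_list([1, 2, 3]))      # [(1, [2, 3])]
print(cons_multiset([1, 2, 3]))  # [(1, [2, 3]), (2, [1, 3]), (3, [1, 2])]
print(cons_set([1, 2, 3]))       # [(1, [1, 2, 3]), (2, [1, 2, 3]), (3, [1, 2, 3])]
```

The three functions share one name in Egison (cons); the matcher argument
selects which decomposition is used.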

The characteristic of our pattern-matching expressions is that they take a
matcher. This realizes polymorphism of pattern-constructors and enables us
to use the same pattern-constructors for similar data types. We specify a
matcher in each pattern-matching expression because data of non-free data
types, such as collections, may be pattern-matched as different data types
in different places of a program.

We now introduce two other pattern-constructors, nil and join. The nil
pattern-constructor takes no arguments and matches when the target is an
empty collection. The join pattern-constructor takes two arguments and
divides a collection into two collections. The following is a demonstration
of join.

> (match-all {1 2 3} (list integer) [<join $xs $ys> [xs ys]])
{[{} {1 2 3}] [{1} {2 3}] [{1 2} {3}] [{1 2 3} {}]}
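
As a rough model of what join does under the list matcher, the Python sketch
below enumerates every split of a list into a prefix and a suffix. The name
join_list is hypothetical and the code only mirrors the output shown above;
it says nothing about how Egison computes it.

```python
def join_list(xs):
    """Every way to split xs into a prefix and a suffix, shortest prefix first."""
    return [(xs[:i], xs[i:]) for i in range(len(xs) + 1)]


print(join_list([1, 2, 3]))
# [([], [1, 2, 3]), ([1], [2, 3]), ([1, 2], [3]), ([1, 2, 3], [])]
```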

Finally, we can even handle pattern-matching that has infinitely many results.

> (take 8 (match-all nats (set integer) [<cons $m <cons $n _>> [m n]])) 

{[1 1] [1 2] [2 1] [1 3] [2 2] [3 1] [1 4] [2 3]} 

take is a function that takes a number n and a collection xs and returns
the first n elements of xs. nats is an infinite list that contains all
natural numbers. ‘_’ is a wildcard and matches any object. Our
pattern-matching system guarantees that all successful matching results are
enumerated. The traversal strategy in the pattern-matching process is
similar to the idea in [Spivey(2000)]. In brief, we adopt breadth-first
search as the traversal strategy.
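
The need for a fair traversal can be illustrated with a small Python sketch.
It enumerates pairs of positive integers one anti-diagonal at a time, which
is one fair order that happens to coincide with the eight results shown
above. The function name pairs_fair is hypothetical, and the sketch does not
claim to reproduce Egison's actual search procedure.

```python
from itertools import count, islice


def pairs_fair():
    """Yield every pair (m, n) of positive integers, one anti-diagonal at a time.

    Each pair appears after finitely many steps, so no successful match is
    starved. A depth-first nested loop would instead stay on m = 1 forever.
    """
    for total in count(2):            # total = m + n
        for m in range(1, total):
            yield (m, total - m)


print(list(islice(pairs_fair(), 8)))
# [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), (2, 3)]
```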

2.2 Non-Linear Pattern-Matching 

Non-linear patterns are patterns that allow multiple occurrences of the same
variable in a pattern. The following is an example of a non-linear pattern.
The output of this example is the collection of numbers from which a
sequence of three consecutive numbers starts.


> (match-all {1 5 6 2 4} (multiset integer)
    [<cons $n <cons ,(+ n 1) <cons ,(+ n 2) _>>> n])
{4}


A pattern is examined from left to right, and the binding of a
pattern-variable can be referred to on its right side in the pattern. In
this example, the pattern-variable ‘$n’ is first bound to any element of the
collection, since the matcher is (multiset integer). After that, the
value-patterns ‘,(+ n 1)’ and ‘,(+ n 2)’ are examined. A value-pattern
begins with ‘,’. The expression following ‘,’ can be any kind of expression.
In this case, a value-pattern matches a target if the target is equal to the
value of the expression. Therefore, after successful pattern-matching, ‘$n’
is bound to an element from which a sequence of three consecutive numbers
starts.
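
The left-to-right reading described above can be mimicked in Python as
follows. This is a hand-written sketch for this one pattern, with the
multiset modeled as a list and the hypothetical name three_in_a_row; it is
not a general matcher.

```python
def three_in_a_row(xs):
    """Return each n in the multiset xs such that n + 1 and n + 2 are also present."""
    results = []
    for i, n in enumerate(xs):            # $n: any element (multiset cons)
        rest = xs[:i] + xs[i + 1:]
        if n + 1 in rest:                 # ,(+ n 1) must match some remaining element
            rest2 = list(rest)
            rest2.remove(n + 1)           # consume one occurrence of n + 1
            if n + 2 in rest2:            # ,(+ n 2) must match what is left
                results.append(n)
    return results


print(three_in_a_row([1, 5, 6, 2, 4]))  # [4]
```

Each nested condition corresponds to one value-pattern, and it can refer to n
only because n was bound further to the left.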

How value-patterns are handled is defined in matchers and therefore varies
between matchers. For example, the way to check equality of integers is
defined in the integer matcher. This realizes polymorphism of
value-patterns. For example, the ways to check equality of lists and of
multisets are different, but we can use value-patterns for both of them,
just as we can use the same pattern-constructors for them. The advantage of
the value-pattern notation over guards is that it lets us read a pattern in
the same order as the execution of pattern-matching.

Let us show another demonstration of non-linear patterns. It enumerates all 
twin primes by pattern-matching against the infinite list of prime numbers. 

> (define $twin-primes
    (match-all primes (list integer)
      [<join _ <cons $p <cons ,(+ p 2) _>>> [p (+ p 2)]]))

> (take 6 twin-primes)
{[3 5] [5 7] [11 13] [17 19] [29 31] [41 43]}
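
For comparison, here is a Python sketch that computes the same six pairs with
generators. The helper names primes and twin_primes are chosen for this
illustration, and the trial-division prime generator is only a simple
stand-in for the primes list assumed by the Egison code.

```python
from itertools import count, islice


def primes():
    """Yield the primes in increasing order by trial division (simple, not fast)."""
    found = []
    for k in count(2):
        if all(k % p for p in found if p * p <= k):
            found.append(k)
            yield k


def twin_primes():
    """Yield (p, p + 2) whenever the prime right after p in the list is p + 2."""
    prev = None
    for p in primes():
        if prev is not None and p == prev + 2:
            yield (prev, p)
        prev = p


print(list(islice(twin_primes(), 6)))
# [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43)]
```

The explicit prev bookkeeping is what the pattern <cons $p <cons ,(+ p 2) _>>
expresses declaratively.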

We can write pattern-matching against nested non-free data types, such as
a list of multisets or a set of sets, as follows.

> (match-all {{1 2 3 4 5} {4 5 1} {6 1 7 4}} (list (multiset integer))
    [<cons <cons $n _> <cons <cons ,n _> <cons <cons ,n _> <nil>>>> n])
{1 4}

Our language has match expressions, as other functional languages do. A
match expression takes multiple match-clauses and tries pattern-matching
for each pattern, starting from the first match-clause. A match expression
is useful for expressing conditional branches.

⟨match-expr⟩ ::= ‘(match’ ⟨tgt-expr⟩ ⟨matcher-expr⟩ ‘{’ ⟨match-clause⟩* ‘})’

The following is a demonstration that determines poker hands. Note that
every poker hand is expressed in a single pattern. The card matcher is
defined in the same way as algebraic data types in existing functional
languages.
(define $poker-hands
  (lambda [$cs]
    (match cs (multiset card)
      {[<cons <card $s $n>
         <cons <card ,s ,(- n 1)>
          <cons <card ,s ,(- n 2)>
           <cons <card ,s ,(- n 3)>
            <cons <card ,s ,(- n 4)>
             <nil>>>>>>
        <Straight-Flush>]
       [<cons <card _ $n>
         <cons <card _ ,n>
          <cons <card _ ,n>
           <cons <card _ ,n>
            <cons _
             <nil>>>>>>
        <Four-of-Kind>]
       [<cons <card _ $m>
         <cons <card _ ,m>
          <cons <card _ ,m>
           <cons <card _ $n>
            <cons <card _ ,n>
             <nil>>>>>>
        <Full-House>]
       [<cons <card $s _>
         <cons <card ,s _>
          <cons <card ,s _>
           <cons <card ,s _>
            <cons <card ,s _>
             <nil>>>>>>
        <Flush>]
       [<cons <card _ $n>
         <cons <card _ ,(- n 1)>
          <cons <card _ ,(- n 2)>
           <cons <card _ ,(- n 3)>
            <cons <card _ ,(- n 4)>
             <nil>>>>>>
        <Straight>]
       [<cons <card _ $n>
         <cons <card _ ,n>
          <cons <card _ ,n>
           <cons _
            <cons _
             <nil>>>>>>
        <Three-of-Kind>]
       [<cons <card _ $m>
         <cons <card _ ,m>
          <cons <card _ $n>
           <cons <card _ ,n>
            <cons _
             <nil>>>>>>
        <Two-Pair>]
       [<cons <card _ $n>
         <cons <card _ ,n>
          <cons _
           <cons _
            <cons _
             <nil>>>>>>
        <One-Pair>]
       [<cons _
         <cons _
          <cons _
           <cons _
            <cons _
             <nil>>>>>>
        <Nothing>]})))
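
To show what the multiset matcher saves us from writing, the following
Python sketch classifies a hand by explicitly trying every ordering of the
five cards against each clause in turn. It is a brute-force model under
stated assumptions, not the way Egison evaluates the patterns: cards are
modeled as (suit, number) tuples and the name poker_hand is hypothetical.

```python
from itertools import permutations


def poker_hand(cards):
    """Classify five (suit, number) cards by trying every ordering of the hand."""

    def matches(pred):
        # Multiset matching is modeled as backtracking over all orderings.
        return any(pred(*zip(*order)) for order in permutations(cards))

    def descending(n):
        # n[0] plays the role of $n; the other cards must be n-1 ... n-4.
        return all(n[i] == n[0] - i for i in range(5))

    # (result, predicate over the suits s and numbers n of one ordering)
    clauses = [
        ("Straight-Flush", lambda s, n: len(set(s)) == 1 and descending(n)),
        ("Four-of-Kind", lambda s, n: n[0] == n[1] == n[2] == n[3]),
        ("Full-House", lambda s, n: n[0] == n[1] == n[2] and n[3] == n[4]),
        ("Flush", lambda s, n: len(set(s)) == 1),
        ("Straight", lambda s, n: descending(n)),
        ("Three-of-Kind", lambda s, n: n[0] == n[1] == n[2]),
        ("Two-Pair", lambda s, n: n[0] == n[1] and n[2] == n[3]),
        ("One-Pair", lambda s, n: n[0] == n[1]),
    ]
    for name, pred in clauses:        # first matching clause wins, as in match
        if matches(pred):
            return name
    return "Nothing"


print(poker_hand([("S", 3), ("H", 3), ("D", 3), ("C", 9), ("S", 9)]))  # Full-House
```

As in the Egison code, the order of the clauses matters: a full house would
also satisfy the Three-of-Kind clause if that clause were tried first.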


2.3 Loop Patterns 

Let us consider a function comb2 that takes a collection and returns the
2-combinations of its elements. The function is written using
pattern-matching as follows. something is the only built-in matcher, and it
can be used only for pattern-matching with a wildcard or a pattern-variable.

> (define $comb2 (lambda [$xs]
    (match-all xs (list something)
      [<join _ <cons $a_1 <join _ <cons $a_2 _>>>> {a_1 a_2}])))

> (comb2 {1 2 3 4})
{{1 2} {1 3} {2 3} {1 4} {2 4} {3 4}}

Now, we explain indexed-variables. A variable whose name is followed by ‘_’
and an expression is an indexed-variable. The expression after ‘_’ must
evaluate to a natural number and is called an index. We can append as many
indexes as we want.

Next, let us consider comb3, comb4, comb5, and so on. The patterns in these
combX functions have the same form, <join _ <cons $a_1 <join _ <cons $a_2
... _>...>>>. It seems possible to generalize them. A loop-pattern serves
this purpose. A loop-pattern has the following syntax.

⟨loop-pat⟩ ::= ‘(loop’ ⟨pat-var⟩ ‘[’ ⟨idx⟩ ⟨last-num⟩ ‘]’ ⟨rep-pat⟩ ⟨tail-pat⟩ ‘)’
The arguments of a loop-pattern represent, respectively, an index-variable,
a range of the index, the pattern to be repeated, and the pattern at the
end. The range of the index is represented by a tuple that consists of the
number where the index starts, called the current index, and the number
where the index ends, called the last index. We can define comb, which
handles general n-combinations of the elements, as follows.

> (define $comb (lambda [$xs $n]
    (match-all xs (list something)
      [(loop $i [1 n] <join _ <cons $a_i ...>> _)
       (map (lambda [$i] a_i) (take n nats))])))

> (comb {1 2 3 4} 2)
{{1 2} {1 3} {2 3} {1 4} {2 4} {3 4}}

> (comb {1 2 3 4} 3)
{{1 2 3} {1 2 4} {1 3 4} {2 3 4}}
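
For readers more familiar with explicit recursion, the Python sketch below
computes the same combinations. Each level of recursion plays the role of
one repetition of the loop body: skip a prefix (join), take one element
(cons), and continue with the rest. This is only a model of the result; the
order of the output may differ from Egison's breadth-first order.

```python
def comb(xs, n):
    """All n-element selections from the list xs, keeping the original order."""
    if n == 0:                        # loop finished: the tail pattern _ matches anything
        return [[]]
    results = []
    for i, x in enumerate(xs):        # <join _ <cons $a_i ...>>: skip a prefix, take x
        for tail in comb(xs[i + 1:], n - 1):
            results.append([x] + tail)
    return results


print(comb([1, 2, 3, 4], 3))  # [[1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]]
```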

We explain how the above code is interpreted when n is 2. The loop-pattern
is (loop $i [1 2] <join _ <cons $a_i ...>> _). When the interpreter meets a
loop-pattern and the current index is not greater than the last index, the
loop-pattern returns its third argument, replacing ‘...’ with the
loop-pattern itself. Therefore, the above example is evaluated to <join _
<cons $a_1 (loop $i [2 2] <join _ <cons $a_i ...>> _)>>. Note that in the
evaluation, the index-variable i is replaced with the current index, 1 in
the example. Moreover, the current index of the expanded loop-pattern
proceeds from 1 to 2. That is, [1 2] is replaced with [2 2]. Repeating this
evaluation, we reach <join _ <cons $a_1 <join _ <cons $a_2 (loop $i [3 2]
<join _ <cons $a_i ...>> _)>>>>. When the current index is greater than the
last index, a loop-pattern returns its fourth argument. So in this case, the
loop-pattern is replaced with ‘_’. Then, we get <join _ <cons $a_1 <join _
<cons $a_2 _>>>>. It is the same pattern we used in comb2.
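
The unrolling just described can be written as a short recursive function.
The Python sketch below builds the expanded pattern as a string for this
particular loop-pattern; the name expand_loop is hypothetical and the code
is only a model of the rewriting steps, not part of Egison.

```python
def expand_loop(current, last):
    """Unroll (loop $i [current last] <join _ <cons $a_i ...>> _) into a plain pattern."""
    if current > last:
        return "_"                                    # fourth argument: the tail pattern
    inner = expand_loop(current + 1, last)            # what replaces '...'
    return f"<join _ <cons $a_{current} {inner}>>"    # third argument with i := current


print(expand_loop(1, 2))
# <join _ <cons $a_1 <join _ <cons $a_2 _>>>>
```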

In the above, we omitted the explanation of a restriction on the placement
of ‘...’. ‘...’ must be placed at the end of the repeated pattern (the third
argument). For example, <cons ... <nil>> is prohibited. This restriction
determines which loop-pattern a given ‘...’ belongs to, and thus allows us
to write nested loop-patterns.

If we use loop-patterns in a pattern, the number of pattern-variables in
the pattern can vary with a parameter. This is the reason why we introduced
indexed-variables.


3 Pattern-Matching-Oriented Programming Style 

We can redefine well-known list library functions with pattern-matching.
eq is a predefined matcher for data types on which equality is syntactically
defined. When the eq matcher is used, equality is checked for value-patterns.

(define $map (lambda [$xs $fn] (match-all xs (list something)
  [<join _ <cons $x _>> (fn x)])))

(define $member? (lambda [$x $xs] (match xs (multiset eq)
  {[<cons ,x _> <True>] [_ <False>]})))

(define $delete (lambda [$x $xs] (match xs (list eq)
  {[<join $hs <cons ,x $ts>> (append hs ts)] [_ xs]})))

(define $take (lambda [$n $xs] (match xs (list something)
  {[(loop $i [1 n] <cons $a_i ...> _)
    (map (lambda [$i] a_i) (take n nats))]})))
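
To make the reading of two of these definitions explicit, here is a Python
sketch of member? and delete written with ordinary loops. It assumes that
match returns the first successful result and that join tries the shortest
prefix first, so that delete removes the first occurrence; both assumptions
are inferred from the examples in this paper rather than stated by it. The
names member and delete are chosen for this illustration.

```python
def member(x, xs):
    """(multiset eq) with <cons ,x _>: does some element equal x?"""
    return any(y == x for y in xs)


def delete(x, xs):
    """(list eq) with <join $hs <cons ,x $ts>>: drop one occurrence of x."""
    for i, y in enumerate(xs):
        if y == x:
            return xs[:i] + xs[i + 1:]    # (append hs ts)
    return xs                              # fallback clause [_ xs]


print(member(2, [1, 2, 3]))     # True
print(delete(2, [1, 2, 3, 2]))  # [1, 3, 2]
```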


4 Related Work 


In this section, we introduce existing studies in the field of pattern-matching. 


McBride’s symbol manipulation system [McBride et al.(1970)] may be the first
non-linear pattern-matching system. Their paper contains several
demonstrations that process mathematical expressions and show the expressive
power of non-linear patterns. However, McBride’s approach does not support
pattern-matching with backtracking and supports pattern-matching only
against lists as collections.

Wadler’s views [Wadler(1987)] provide a way to decompose data with multiple
representations by declaring transformations between the representations.
For example, we can intuitively handle complex numbers, which have Cartesian
and polar representations. Data are automatically transformed in the
matching process. However, views treat neither multiple results of
pattern-matching nor non-linear patterns.

Active patterns [Erwig(1996)] provide a way to decompose non-free data. For
example, we can implement the cons pattern-constructor for multisets with
active patterns. The limitation of active patterns is that they do not
support backtracking. Therefore, for example, we cannot write a pattern that
matches identical pairs in a collection.

First class patterns [Tullsen(2000)] are a sophisticated system that treats
patterns as first class objects. First class patterns can deal with
pattern-matching that has multiple results. However, this proposal has the
limitation that it does not support non-linear pattern-matching.

Functional logic programming [Antoy and Hanus(2005)] is another approach. It
applies the unification of logic programming to pattern-matching, so it can
handle both non-linear patterns and backtracking. The advances of our
proposal over this work are polymorphism of pattern-constructors by
matchers, support for infinitely many pattern-matching results, and
loop-patterns. There are also differences in the underlying mechanism and in
the way matchers are defined, though we do not focus on them in this paper.


5 Conclusion 

The contribution of our proposal is the invention of a pattern-matching
system with all of the following features in functional programming.
Additionally, we give examples of how to write programs utilizing these
features. We contribute to the programming language community by extending
the range of programs that can be written concisely.

Non-linear patterns

We can handle multiple occurrences of the same variable in a pattern.
Non-linear patterns are represented with value-patterns, which allow us
to write expressions in a pattern.

Multiple pattern-matching results

We can handle pattern-matching that has multiple and even infinitely many
results. This feature is necessary for pattern-matching against data
types whose data have multiple ways of decomposition.

Polymorphism of pattern-constructors

We can use the same pattern-constructors for similar data types. This
feature reduces the number of pattern-constructor names to remember.
With just nil, cons, and join, we can express most patterns against
collections.

Loop patterns

Loop-patterns enable us to express patterns in which the number of
pattern-variables can vary with a parameter.

We believe that direct and concise representation of algorithms encourages
us to implement truly new things that we have never imagined implementing.
We hope our work will lead to breakthroughs in various fields.